Setup

## **** __Utilized Cores__ **** = 2
## $subsetGenes
## [1] "protein_coding"
## 
## $subsetCells
## [1] 500
## 
## $resolution
## [1] 0.6
## 
## $resultsPath
## [1] "./Results"
## 
## $nCores
## [1] 2
## 
## $perplexity
## [1] 30
## [1] "Written using: Seurat version* 2.3.4 2018-07-17"
## [1] "Seurat 3.1.0"
## [1] "monocle3 0.1.2"
## [1] "garnett 0.2.3"

Preprocessing

Seurat/2.3.4 to Monocle3

The object is also subset to protein-coding genes only.

## Subsetting genes: protein_coding
## Error in biomaRt::getBM(attributes = c("hgnc_symbol", "gene_biotype"), : object 'DAT' not found
## Error in eval(lhs, parent, parent): object 'biotypes' not found
## Error in subset_seurat(DAT, genes.use = proteins): object 'DAT' not found
## Error in Seurat::UpdateSeuratObject(object = DAT): object 'DAT' not found
## Error in levels(proteins): object 'proteins' not found
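As a rough illustration of the protein-coding filter described above, the sketch below applies the same idea to a toy matrix in base R. The gene names, the `expr` matrix, and the `subset_by_biotype` helper are all hypothetical; only the column names `hgnc_symbol` and `gene_biotype` come from the `biomaRt::getBM()` call shown in the output, and the annotation table here is hand-written rather than queried from Ensembl.

```r
# Toy expression matrix: genes in rows, cells in columns (hypothetical data).
set.seed(1)
gene_ids <- c("GENE_A", "GENE_B", "GENE_C", "GENE_D", "GENE_E")
expr <- matrix(rpois(5 * 4, lambda = 3), nrow = 5,
               dimnames = list(gene_ids, paste0("cell", 1:4)))

# Hypothetical annotation table standing in for a biomaRt::getBM() result.
biotypes <- data.frame(
  hgnc_symbol  = gene_ids,
  gene_biotype = c("protein_coding", "lincRNA", "protein_coding",
                   "pseudogene", "protein_coding"),
  stringsAsFactors = FALSE
)

# Keep only rows whose gene is annotated with one of the requested biotypes.
subset_by_biotype <- function(expr, biotypes, keep = "protein_coding") {
  wanted <- biotypes$hgnc_symbol[biotypes$gene_biotype %in% keep]
  expr[rownames(expr) %in% wanted, , drop = FALSE]
}

expr_pc <- subset_by_biotype(expr, biotypes)
rownames(expr_pc)
```

With a real Seurat or Monocle3 object the same logical index would be applied to the object's gene dimension rather than to a bare matrix.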

Clustering

  • Unsupervised clustering of cells is a common step in many single-cell expression workflows. In an experiment containing a mixture of cell types, each cluster might correspond to a different cell type.
  • Monocle3's cluster_cells function takes a cell_data_set as input, clusters the cells using Louvain community detection, and returns a cell_data_set with internally stored cluster assignments.
  • In addition to clusters, this function calculates partitions, which represent superclusters of the Louvain communities that are found using a kNN pruning method. Cluster assignments can be accessed using the clusters function and partition assignments using the partitions function.

  • Only the top N most variable genes (topN) are used for clustering.
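The restriction to the most variable genes can be sketched in base R as follows. This is a minimal illustration under stated assumptions: the matrix, the `top_variable_genes` helper, and the value `top_n <- 5` are hypothetical, and plain per-gene variance is used as a stand-in for whatever variable-gene criterion the report actually applied.

```r
set.seed(42)
# Hypothetical normalised expression: 20 genes x 30 cells.
expr <- matrix(rnorm(20 * 30), nrow = 20,
               dimnames = list(paste0("gene", 1:20), paste0("cell", 1:30)))
# Inflate the spread of the first five genes so they are clearly the most variable.
expr[1:5, ] <- expr[1:5, ] * 10

# Rank genes by variance across cells and return the names of the top n.
top_variable_genes <- function(expr, n) {
  gene_var <- apply(expr, 1, var)
  names(sort(gene_var, decreasing = TRUE))[seq_len(min(n, length(gene_var)))]
}

top_n <- 5  # hypothetical value of topN
top_genes <- top_variable_genes(expr, top_n)

# Matrix restricted to the selected genes, ready to pass on to clustering.
expr_top <- expr[top_genes, , drop = FALSE]
dim(expr_top)
```

In a Monocle3 workflow the reduced object would then go through dimensionality reduction and cluster_cells, with the results read back via clusters and partitions as described in the bullets above.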

Louvain Clustering

Pseudotime

## Warning in louvain_clustering(data, pd[row.names(data), ], k = k, weight = weight, : RANN counts the point itself, k must be smaller than
## the total number of points - 1 (all other points) - 1 (itself)!

## Warning: Transformation introduced infinite values in continuous y-axis
## Warning: Transformation introduced infinite values in continuous y-axis

UMAP Plots

Disease status (dx), mutation status (mut), and individual ID (ID) are all well mixed across clusters.

dx

mut

ID

Cell-type Identification

Garnett

Use a pre-trained classifier from Pliner et al.

## Loading required package: AnnotationDbi
## 
## Attaching package: 'AnnotationDbi'
## The following object is masked from 'package:plotly':
## 
##     select
## The following object is masked from 'package:dplyr':
## 
##     select
## 
##  [1] "ACCNUM"       "ALIAS"        "ENSEMBL"      "ENSEMBLPROT" 
##  [5] "ENSEMBLTRANS" "ENTREZID"     "ENZYME"       "EVIDENCE"    
##  [9] "EVIDENCEALL"  "GENENAME"     "GO"           "GOALL"       
## [13] "IPI"          "MAP"          "OMIM"         "ONTOLOGY"    
## [17] "ONTOLOGYALL"  "PATH"         "PFAM"         "PMID"        
## [21] "PROSITE"      "REFSEQ"       "SYMBOL"       "UCSCKG"      
## [25] "UNIGENE"      "UNIPROT"
## Warning in doTryCatch(return(expr), name, parentenv, handler): The
## following genes used in the classifier are not present in the input CDS.
## Interpret with caution: ENSG00000174059, ENSG00000007062,
## ENSG00000157404, ENSG00000185291

Gene Expression Plots

Violin

## Warning: Transformation introduced infinite values in continuous y-axis

## Warning: Transformation introduced infinite values in continuous y-axis

## Warning: Transformation introduced infinite values in continuous y-axis
## Warning: Removed 26788 rows containing non-finite values (stat_ydensity).
## Warning: Removed 26788 rows containing non-finite values (stat_summary).

Cell Type Proportions

  • Check the proportion of cells within each disease group that belong to each cluster. This helps determine whether any PD vs. Control DEGs merely reflect differences in cell-type proportions.
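One way such a comparison could be computed is sketched below in base R. The `meta` data frame and its values are hypothetical, and the difference is taken as a simple subtraction of within-group percentages (Control minus PD); the exact definition behind the report's "% difference" figures is not shown, so this should be read as an assumption rather than the report's method.

```r
# Hypothetical per-cell metadata: disease group and cluster label.
meta <- data.frame(
  dx      = c(rep("Control", 10), rep("PD", 10)),
  cluster = c(rep(1, 6), rep(2, 4),   # Controls: 6 cells in cluster 1, 4 in cluster 2
              rep(1, 5), rep(2, 5))   # PD:       5 cells in cluster 1, 5 in cluster 2
)

# Cells per group and cluster, converted to percentages within each group.
counts <- table(meta$dx, meta$cluster)
pct <- prop.table(counts, margin = 1) * 100

# Percentage-point difference per cluster, Control minus PD.
diff_pct <- pct["Control", ] - pct["PD", ]
round(diff_pct, 2)
```

Normalising within each disease group (margin = 1) keeps the comparison fair when the groups contribute different numbers of cells.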

dx

## [1] "There is a 5.22 % difference in the number of Cluster 1 cells ( Canonical Monocytes ) in Controls compared to PD."
## [1] "There is a -4.67 % difference in the number of Cluster 2 cells ( Intermediate Monocytes ) in Controls compared to PD."

mut

individual

Save Checkpoint

Save the R object so that the memory-intensive differential gene expression (DGE) analyses can be run on a computing cluster.
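A checkpoint of this kind can be written with base R's saveRDS and restored with readRDS. The sketch below is illustrative only: the `checkpoint` list is a hypothetical stand-in for the analysis object, and the file goes to a temporary directory rather than to the "./Results" path listed in the setup parameters.

```r
# Hypothetical stand-in for the analysis object (the report would save its own object).
checkpoint <- list(clusters = c(cell1 = 1, cell2 = 2), resolution = 0.6)

# Write to a temporary location; file name and directory are assumptions.
out_file <- file.path(tempdir(), "checkpoint.rds")
saveRDS(checkpoint, file = out_file)

# On the computing cluster, the object would be restored like this.
restored <- readRDS(out_file)
identical(checkpoint, restored)
```

saveRDS stores a single object without its variable name, so the script on the cluster can assign it to whatever name it prefers.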